Abstract: Scheduling a large number of bags-of-tasks (BoTs) over heterogeneous systems such as hybrid clouds is an interesting problem. When the underlying problem is NP-complete, processing it in parallel on such a heterogeneous system and obtaining an optimal solution is even more complex. End users also expect their Quality of Service (QoS) requirements to be met, and processing a huge number of BoTs concurrently in such an environment under QoS constraints is a major challenge that calls for an exact solution. This paper addresses the scheduling of large-scale applications inspired by real-world workloads, which are characterized by a large number of homogeneous, concurrent bags-of-tasks that are the main sources of bottlenecks but also open great potential for optimization. It proposes a Multi-Objective Game-Theoretic (MOGT) scheduling algorithm for BoTs that gives better solutions than other algorithms when execution time, bandwidth, and storage requirements are considered, and that gives an aggregate result for the given problems. Scientific applications such as NP-complete problems take a long time to produce results. The MOGT scheduling algorithm optimizes the schedule, i.e., the computation time, by dividing each bag-of-tasks (BoT) into sub-bags-of-tasks (SubBoTs), which increases parallel computation and finds the result in a limited number of steps.

Keywords: Multi-objective scheduling, Bags-of-tasks, Hybrid clouds, NP-complete.


INTRODUCTION 
Dissertation Idea 

Executing SubBoTs under the Multi-objective Game-Theoretic Scheduling (MOGTS) algorithm takes less time than executing the undivided BoT. The objective considered here is computation time, with storage and bandwidth as constraints.

Dividing BoTs into SubBoTs increases parallelism. This gives the scheduling algorithm faster updates, and based on those updates the scheduler can make decisions in less time. Under this scheduling, tasks execute faster and processor utilization is higher.
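The division of a BoT into SubBoTs can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `split_bot`, `run_sub_bot`, and `run_bot_in_parallel` are hypothetical, the task body (squaring a number) is a placeholder, and local threads stand in for cloud workers.

```python
from concurrent.futures import ThreadPoolExecutor


def split_bot(tasks, num_sub_bots):
    """Split one BoT (a list of independent tasks) into near-equal SubBoTs."""
    if num_sub_bots < 1:
        raise ValueError("num_sub_bots must be at least 1")
    base, extra = divmod(len(tasks), num_sub_bots)
    sub_bots, start = [], 0
    for i in range(num_sub_bots):
        # The first `extra` SubBoTs receive one additional task.
        size = base + (1 if i < extra else 0)
        sub_bots.append(tasks[start:start + size])
        start += size
    return sub_bots


def run_sub_bot(sub_bot):
    """Placeholder task body (an assumption): square each input value."""
    return [x * x for x in sub_bot]


def run_bot_in_parallel(tasks, num_sub_bots):
    """Run every SubBoT on its own worker and merge the results in order."""
    sub_bots = split_bot(tasks, num_sub_bots)
    with ThreadPoolExecutor(max_workers=num_sub_bots) as pool:
        partial_results = list(pool.map(run_sub_bot, sub_bots))
    return [value for part in partial_results for value in part]
```

Because the tasks in a BoT are independent, the SubBoTs need no coordination beyond the final merge, which is what makes the extra parallelism cheap to exploit.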


The more parallel computation there is, the more resource power it needs. Hybrid cloud resources are therefore required to obtain better performance for NP-complete applications. Computing systems such as clouds have evolved over the past decade to support various types of scientific applications with dependable, consistent, pervasive, and inexpensive access to geographically separated high-end computational capabilities. To program such large and scalable infrastructures, loosely coupled coordination models of legacy software components, such as bags-of-tasks (BoT), hybrid clouds, and workflows, have emerged as one of the most successful programming paradigms in the scientific community.

From the end user's perspective, minimizing execution time is the preferred functionality, whereas from the system's perspective, system-level efficiency and fairness [1] are a good motivation, so that applications with large-scale computation are allocated more resources. Currently, only a few schemes can deal with both perspectives, that is, optimizing user objectives such as makespan while fulfilling other constraints and providing good efficiency and fairness to all users. Meanwhile, many applications generate huge data sets in a relatively short time. For example, the Large Hadron Collider produced 5-6 PB of data per year, which must be accommodated and handled efficiently through scheduling that respects bandwidth and storage constraints.

Increase Parallel Computation in Scheduling

There are many existing multi-objective evolutionary algorithms, such as the Multi-objective Genetic Algorithm (MOGA) and Multi-objective Evolutionary Programming (MOEP), that work on makespan minimization but have high complexity compared to the Multi-objective Game-Theoretic Scheduling (MOGTS) algorithm.

The motivation here is to schedule bags-of-tasks effectively, so that each BoT is computed with minimum makespan [6] while makespan, storage, and bandwidth are considered as objectives. As part of the increased parallelism, multithreading also increases, which can result in faster task execution.

* Assumption: It is assumed that as parallel computation increases, more hardware resources are needed.
Figure 1.1: BoT System



Figure 1.2: Proposed SubBoT System


USE OF CLOUD PLATFORM 
constraint-based scheduling, which minimizes execution time while meeting a specified budget for delivering results. A new type of genetic algorithm is developed to solve the scheduling optimization problem, and the scheduling algorithm is tested in a simulated Grid testbed.

2. I. Stephie Rachel, Joshua Samuel Raj, and V. Vasudevan say [4]:

An application executing in a Grid environment may encounter a failure. This can be overcome by a reliable scheduler, which plays the major role of allocating applications to reliable resources based on the reliability requirements that users specify for their applications. The reliability requirements considered in this paper are deadline and budget, which are also the quality-of-service requirements of the applications. With deadline and budget as the main factors, the tasks are scheduled to reliable processors.

3. Daniel Grosu, Anthony T. Chronopoulos, and Ming-Ying Leung say [5]:

This paper formulates the static load balancing problem in single-class job distributed systems as a cooperative game among computers. It is shown that the Nash Bargaining Solution (NBS) provides a Pareto optimal allocation which is also fair to all jobs. The paper proposes a cooperative load balancing game and presents the structure of the NBS. For this game, an algorithm for computing the NBS is derived. It is shown that the fairness index is always 1 using the NBS, which means that the allocation is fair to all jobs. Finally, the performance of the cooperative load balancing scheme is compared with that of other existing schemes.
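The claim that a fairness index of 1 means a fair allocation can be checked numerically. The summary above does not spell out which index is meant; the sketch below assumes Jain's fairness index, (sum of x)^2 / (n * sum of x^2), and the function name `fairness_index` is hypothetical.

```python
def fairness_index(values):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Assumption: `values` holds one non-negative performance figure per job
    (for example its expected response time). The index is 1 when all
    values are equal and approaches 1/n as the allocation becomes more skewed.
    """
    n = len(values)
    total = sum(values)
    squares = sum(x * x for x in values)
    if n == 0 or squares == 0:
        raise ValueError("need at least one non-zero value")
    return (total * total) / (n * squares)
```

For example, `fairness_index([2.0, 2.0, 2.0])` evaluates to 1.0, while an uneven allocation such as `[1.0, 3.0]` gives 0.8.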

4. Preetam Ghosh says [6]:


Previously, many scientific applications used Grid computing services, in which resources are distributed over a network, so aggregating results took a long time. To avoid this drawback, it is feasible to use cloud-based services, in which the resources are arranged in one server rack and the user has remote access.

If the private resources cannot perform well enough for heavy computation, paid services need to be accessed.
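The decision to fall back on paid services can be sketched as a simple capacity check. This is only an illustration under stated assumptions: the function `extra_public_vms`, the deadline-based trigger, and the idea that private and public VMs process work at the same rate are not taken from the paper.

```python
import math


def extra_public_vms(total_work, private_vms, vm_rate, deadline):
    """Return how many paid (public cloud) VMs are needed on top of the private ones.

    Assumptions: `total_work` is measured in work units, every VM processes
    `vm_rate` work units per time unit, and all work must finish within `deadline`.
    """
    if vm_rate <= 0 or deadline <= 0:
        raise ValueError("vm_rate and deadline must be positive")
    vms_needed = math.ceil(total_work / (vm_rate * deadline))
    # Rent public VMs only when the private cloud alone is not enough.
    return max(0, vms_needed - private_vms)
```

With 4 private VMs, a rate of 10 units per time unit, and a deadline of 20, a workload of 600 units stays on the private cloud, while 1000 units would require one additional paid VM.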

LITERATURE SURVEY 

1. Jia Yu and Rajkumar Buyya say [3]:

Over the last few years, Grid technologies have progressed towards a service-oriented paradigm that enables a new way of service provisioning based on utility computing models. Users consume these services based on their QoS (Quality of Service) requirements. In such pay-per-use Grids, workflow execution cost must be considered during scheduling based on the user's QoS constraints. In this paper, they propose a budget


This paper proposes cost-optimal job allocation schemes based on a fair pricing strategy for distributed systems where the nodes can have bandwidth constraints and, subsequently, might encounter high communication delays in job transfer. A job allocator receives discrete, serial batch jobs from the users and assigns them to heterogeneous nodes for completion. Today's distributed computing systems incorporate different types of nodes with varied bandwidth constraints, which should be considered while designing cost-optimal job allocation schemes for better system performance. This paper proposes a fair pricing strategy for job allocation in bandwidth-constrained distributed systems. The strategy formulates an incomplete-information, alternating-offers bargaining game on two variables, price per unit resource and percentage of bandwidth allocated, for both single-class and multiclass jobs at each node. It presents a cost-optimal job allocation scheme for single-class jobs that involve communication delay and, hence, the link bandwidth. For fast and adaptive allocation of multiclass jobs, it describes three efficient heuristics and compares them under different network scenarios. The results show that the proposed algorithms are comparable to existing job allocation schemes in terms of the expected system response time over all jobs.
5. L.F. Bittencourt, E.R.M. Madeira, and N.L.S.D. Fonseca say [7]:

Schedulers for cloud computing determine on which processing resource the jobs of a workflow should be allocated. In hybrid clouds, jobs can be allocated either on a private cloud or on a public cloud on a pay-per-use basis. The capacity of the communication channels connecting these two types of resources impacts the makespan and the cost of workflow execution. This paper introduces the scheduling problem in hybrid clouds, presenting the main characteristics to be considered when scheduling workflows, as well as a brief survey of some of the scheduling algorithms used in these systems. To assess the influence of communication channels on job allocation, we compare and evaluate the impact of the available bandwidth on the performance of some of the scheduling algorithms.

6. Benoit, A., Marchal, L., Pineau, J.F., Robert, Y., and Vivien, F. say [8]:

Scheduling problems are already difficult on traditional parallel machines. They become extremely challenging on heterogeneous clusters, even when embarrassingly parallel applications are considered. For instance, consider a bag-of-tasks application, i.e., an application made of a collection of independent and identical tasks, to be scheduled on a master-worker platform. On the contrary, if the platform gathers heterogeneous processors, connected to the master via different-speed links, then the previous strategies are likely to fail dramatically. This is because it is crucial to select which resources to enroll before initiating the computation.

In this paper, they still target fully parallel applications, but they introduce a much more complex (and more realistic) framework than scheduling a single application. They envision a situation where users, or clients, submit several bag-of-tasks applications to a heterogeneous master-worker platform, using a classical client-server model. Applications are submitted online, which means that there is no a priori (static) knowledge of the workload distribution at the beginning of the execution. When several applications are executed simultaneously, they compete for hardware resources such as network and CPU.

7. Radu Prodan and Simon Ostermann say [9]:

Achieving high performance in Grid environments is often approached in the community as a pure scheduling problem that maps application components, also called services or activities, onto distributed resources. For certain objective functions such as execution time, this is an NP-complete optimization problem that fascinates researchers, who seek advanced heuristics that find good solutions. The validation performed is often limited to simulation based on synthetic applications that use workload information collected from real Grid traces.



Figure 1.3: RP-GD common pattern search 

RP-GD algorithm: RP-GD mines a representative set directly from graph databases. Jianzhong Li, Yong Liu, and Hong Gao adopt the idea of an online algorithm to devise RP-GD. Whenever a frequent subgraph mining algorithm generates a closed frequent subgraph P, RP-GD attempts to find a representative R in the current set of representative patterns RS such that R can cover P. When no representative in RS can cover P, RP-GD builds a new representative Rnew that can cover P using greedy strategies, and puts the newly discovered Rnew into RS. Thus, RP-GD can derive a representative set by scanning the closed frequent subgraphs once.
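The single-pass control flow described for RP-GD can be sketched as follows. This is only the online skeleton: the text does not define the cover test or the greedy construction of Rnew, so both are passed in as hypothetical callables (`covers`, `build_representative`), and the toy usage below replaces subgraphs with integers.

```python
def select_representatives(patterns, covers, build_representative):
    """One pass over the stream of closed frequent patterns.

    patterns: iterable of patterns P, in the order the miner emits them.
    covers(R, P): hypothetical predicate, True when representative R covers P.
    build_representative(P): hypothetical greedy builder returning an Rnew
        that covers P.
    """
    representatives = []  # the representative set RS
    for pattern in patterns:
        if any(covers(rep, pattern) for rep in representatives):
            continue  # an existing representative already covers P
        representatives.append(build_representative(pattern))
    return representatives


# Toy usage: integers stand in for subgraphs, and R covers P when |R - P| <= 1.
toy_rs = select_representatives(
    [1, 2, 5, 6, 9],
    covers=lambda rep, pattern: abs(rep - pattern) <= 1,
    build_representative=lambda pattern: pattern,
)
```

Each pattern is inspected exactly once, which matches the statement that the representative set is derived by scanning the closed frequent subgraphs once.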


System Architecture 

The architecture of our system is shown in Figure 5.3. It works as follows.



Application Model: 


1. Submit the BoT through the message dispatcher to the private cloud. Its services are:

* Provision service
* Membership service
* Scheduling service

2. The BoT needs to be scheduled, so apply the MOGTS algorithm.

3. If extra resources are required for better performance and faster computation, request resources from the public cloud.

4. Provision.

5. Provision of extra resources is done.

6. The gateway for the public cloud issues "Start VMs".

7. The VMs start and join the network.

8. Register the membership.

9. After registration, the worker nodes at the public cloud are ready to work.


Here, the concept is that after dividing a BoT into SubBoTs, powerful resources are needed for better results in parallel computation. This is the only purpose of using the hybrid cloud service.


MATHEMATICAL MODEL

Here,

1] The goal is to minimize the execution time of all applications, F(x):

$$\min F(x) = (f(x), c(x))$$

$$\text{s.t.}\quad h_i(x) \le bl_i,\ i \in [1, M]; \qquad g_i(x) \le sl_i,\ i \in [1, M]; \qquad x \in S$$


2] Matrix which delivers the expected execution time $p_{ki}$ of a task in each BoT $k \in [1, K]$:

$$p_{ki} = \begin{cases} pc_{ki}, & pc_{ki} \ge po_{ki} \\ pc_{ki} + (po_{ki} - pc_{ki}) = po_{ki}, & pc_{ki} < po_{ki} \end{cases}$$

3] where,

$$po_{ki} = \frac{d_{ki}}{b_{ki}}$$
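Items 2] and 3] can be illustrated with a small Python sketch. It assumes that pc is the pure computation time of a task, d the amount of data it moves, and b the bandwidth allocated to it, so that po = d / b is the transfer time; the paper does not define these symbols explicitly, and the function name `expected_time` is hypothetical.

```python
def expected_time(pc, d, b):
    """Expected execution time p_ki of one task of BoT k on site i.

    pc: computation time (assumed meaning of pc_ki)
    d:  data size moved by the task (assumed meaning of d_ki)
    b:  bandwidth allocated to the task (assumed meaning of b_ki)
    """
    if b <= 0:
        raise ValueError("bandwidth must be positive")
    po = d / b  # po_ki = d_ki / b_ki
    # Both branches of the piecewise definition reduce to the larger value:
    # pc when pc >= po, and pc + (po - pc) = po otherwise.
    return max(pc, po)
```

Under this reading a task is compute-bound when pc dominates and transfer-bound otherwise, so allocating more bandwidth only helps tasks for which po exceeds pc.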


4] Data bandwidth can be calculated as, k x> i < YX=i Gki : 

h ki=n _i.^i 

* 1 poki 


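5] The objective of each manager is to minimize execution time while fulfilling the storage and bandwidth constraints, expressed as

$$t_k(\Delta) = \frac{|T_k|}{\sum_{i=1}^{M} \frac{\theta_{ki}}{p_{ki}}} = \frac{\delta_k}{\sum_{i=1}^{M} \frac{\theta_{ki}}{p_{ki}}}$$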
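A short sketch of the manager objective in item 5] follows. The extracted formula is damaged, so the code assumes the reading t_k = delta_k / sum over sites i of (theta_ki / p_ki), where delta_k is the number of tasks of BoT k, theta_ki the number of processors allocated to it on site i, and p_ki the expected time of one task there. The function name `bot_execution_time` is hypothetical.

```python
def bot_execution_time(num_tasks, processors, task_times):
    """Estimated execution time t_k of one BoT.

    num_tasks:  delta_k, the number of tasks in BoT k
    processors: theta_ki for every site i (processors allocated to BoT k)
    task_times: p_ki for every site i (expected time of one task)
    """
    # Aggregate throughput: each site finishes theta_ki / p_ki tasks per time unit.
    rate = sum(theta / p for theta, p in zip(processors, task_times))
    if rate <= 0:
        raise ValueError("the BoT needs at least one allocated processor")
    return num_tasks / rate
```

For instance, 100 tasks on two sites with 2 and 4 processors and task times of 1.0 and 2.0 give a combined rate of 4 tasks per time unit, so the estimate is 25 time units.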


In Multi-Objective Game Theory Scheduling, a reliable system is proposed that gives the best solution. It focuses mainly on three constraints:

1. Execution Time

2. Network Bandwidth

3. Storage Requirement

Set Theory:

S = {I, O}, where I is the input:
$I = (AS, K, M, m_i, p_{ki}, \delta_k, E)$

Require: $\beta_i$, $bl_i$
Compute: $\Delta^{S(l)}$, $\theta^{S(l)}$
where O is the output:

O = (Result of NP-complete problem)


$$h_i(\Delta, \beta) = \sum_{k=1}^{K} \frac{\theta_{ki}\, d_{ki}}{p_{ki}} \le bl_i$$

$$g_i(\Delta) = \sum_{k=1}^{K} sr_k \cdot \theta_{ki} \le sl_i$$
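The two per-site constraints can be checked with a few lines of Python. The sketch assumes the reading h_i = sum over k of (theta_ki * d_ki / p_ki), bounded by the bandwidth limit bl_i, and g_i = sum over k of (sr_k * theta_ki), bounded by the storage limit sl_i; all function names are hypothetical.

```python
def site_bandwidth_use(theta_i, d_i, p_i):
    """h_i: bandwidth demand on site i, summed over all BoTs k."""
    return sum(theta * d / p for theta, d, p in zip(theta_i, d_i, p_i))


def site_storage_use(theta_i, sr):
    """g_i: storage demand on site i, summed over all BoTs k."""
    return sum(s * theta for s, theta in zip(sr, theta_i))


def site_is_feasible(theta_i, d_i, p_i, sr, bl_i, sl_i):
    """True when site i respects both its bandwidth and its storage limit."""
    return (site_bandwidth_use(theta_i, d_i, p_i) <= bl_i
            and site_storage_use(theta_i, sr) <= sl_i)
```

An allocation is admissible only if every site passes this check, which is how the storage and bandwidth constraints restrict the schedules that a manager may choose from.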


6] Resource allocation of BoTs on,

$$\sum_{k=1}^{K} \delta_{ki}\, p_{ki}\, \omega_{ki}$$


7] Weight factor $\omega_k$ for BoT $T_k$,


miniE[l,M]Pki 1 

n/ Pki — pki 

fJUJKl minie[l,M]Pki V M 

L k= l p k i fe “ i y 


Here,

• $n$ = Number of applications

• $AS$ = Set of applications

• $K$ = No. of BoTs

• $M$ = No. of sites

• $m_i$ = No. of processors on site $s_i$, $i \in [1, M]$

• $p_{ki}$ (m × n): ETC matrix

• $\delta_k$ = Number of tasks of BoT $k$ ($k \in [1, K]$)

• $bl_i$ = Bandwidth limit of site $s_i$ ($i \in [1, M]$)

• $T_k$ = BoT $k$ ($k \in [1, K]$)

• $\Delta^{S(l)}$ = Task distribution matrix

• $\theta^{S(l)}$ = Resource allocation matrix

$A_i$ = Makespan of application $A_i$, $i \in [1, n]$, which is the maximum completion time of its BoTs.


pki 

bcoki=-f% 

L k=idki 

scold = 

L k=i sr ki 

8] Makespan can be expressed as the aggregated execution time divided by the number of processors:

$$\arg\min \left( \sum_{k=1}^{K} t_k(\Delta) \right)$$

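9] For the initial step of the matrix, the BoT considers the bandwidth and processors available:

$$\delta_{ki} = \delta_k \cdot \frac{m_i / p_{ki}}{\sum_{i=1}^{M} m_i / p_{ki}}$$

$$b_{ki} = bl_i \cdot \frac{d_{ki}}{\sum_{k=1}^{K} d_{ki}}$$

www.erpublication.org
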
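The initial step in item 9] can be sketched in Python. Since the extracted equations are damaged, the code assumes proportional sharing: the tasks of BoT k are spread over the sites in proportion to m_i / p_ki, and the bandwidth limit bl_i of a site is shared among the BoTs in proportion to their data sizes d_ki. Both function names are hypothetical.

```python
def initial_task_distribution(delta_k, m, p_k):
    """Initial delta_ki for one BoT k over all sites i.

    delta_k: number of tasks of BoT k
    m:       m_i, processors available on every site i
    p_k:     p_ki, expected task time of BoT k on every site i
    """
    rates = [m_i / p_ki for m_i, p_ki in zip(m, p_k)]
    total = sum(rates)
    # Faster sites (more processors, shorter task time) receive more tasks.
    return [delta_k * rate / total for rate in rates]


def initial_bandwidth_share(bl_i, d_i):
    """Initial b_ki on one site i for all BoTs k, proportional to data size d_ki."""
    total = sum(d_i)
    return [bl_i * d / total for d in d_i]
```

The results are fractional in general; a real scheduler would still have to round the task counts to integers, which this sketch leaves out.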


10] After the initial step, the matrices continue to be updated,

$$\theta^{S(l)} = \Theta(\Delta^{S(l-1)}), \qquad \Delta^{S(l)} = \Delta(\theta^{S(l)})$$
Set Theory:

Morphism

A morphism is a map between two objects in an abstract category.



1) A general morphism is called a homomorphism,

2) A morphism f: Y → X in a category is a monomorphism if, for any two morphisms u, v: Z → Y, fu = fv implies that u = v,

3) A morphism f: Y → X in a category is an epimorphism if, for any two morphisms u, v: X → Z, uf = vf implies u = v,

4) A bijective morphism is called an isomorphism (if there is an isomorphism between two objects, then we say they are isomorphic),

5) A morphism from an object to itself is called an endomorphism, and

6) An isomorphism between an object and itself is called an automorphism.
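For finite sets, the monomorphism condition in item 2) can be tested by brute force. The sketch below is only an illustration in the category of finite sets, where maps are stored as Python dicts; the helper names are hypothetical. In this category a one-element test object Z is already enough to detect a map that is not injective.

```python
from itertools import product


def all_maps(domain, codomain):
    """Yield every function domain -> codomain as a dict."""
    domain = list(domain)
    for images in product(list(codomain), repeat=len(domain)):
        yield dict(zip(domain, images))


def compose(f, u):
    """Return the composite f after u, i.e. z -> f[u[z]]."""
    return {z: f[y] for z, y in u.items()}


def is_monomorphism(f, source, test_object):
    """Check left-cancellation: fu == fv implies u == v for all u, v: Z -> Y.

    f:           the map Y -> X under test, as a dict
    source:      the set Y
    test_object: the set Z used to probe f
    """
    maps = list(all_maps(test_object, source))
    return all(u == v
               for u in maps for v in maps
               if compose(f, u) == compose(f, v))
```

An injective map such as `{0: "a", 1: "b"}` passes the check, while `{0: "a", 1: "a"}` fails it because two different maps into Y become equal after composing with f.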
CONCLUSION

The proposed system uses a game-theoretic scheduling algorithm for SubBoTs, which efficiently schedules BoTs over a cloud, works for NP-complete problems, and achieves good solutions for them. It works on all the parameters (makespan, storage, and bandwidth), which gives better aggregate results. It gives an exact solution for each BoT to execute on the specific site that is most suitable for it. The proposed system divides BoTs into SubBoTs according to their assigned deadlines. As future scope, it can be implemented as a model for astronomical problems, such as the scientific study of galaxies, or for scientific problems in physics that take a lot of computation time.